Transferring knowledge from a RNN to a DNN
Abstract
Deep Neural Network (DNN) acoustic models have yielded many state-of-the-art results in Automatic Speech Recognition (ASR) tasks. More recently, Recurrent Neural Network (RNN) models have been shown to outperform their DNN counterparts. However, state-of-the-art DNN and RNN models tend to be impractical to deploy on embedded systems with limited computational capacity. Traditionally, the approach for embedded platforms is to either train a small DNN directly, or to train a small DNN that learns the output distribution of a large DNN. In this paper, we utilize a state-of-the-art RNN to transfer knowledge to a small DNN. We use the RNN model to generate soft alignments and minimize the Kullback-Leibler divergence against the small DNN. The small DNN trained on the soft RNN alignments achieved a 3.9 WER on the Wall Street Journal (WSJ) eval92 task, compared to a baseline of 4.6 WER, a relative improvement of more than 13%.
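The training objective described above — driving the small DNN's frame-level posteriors toward the RNN's soft alignments by minimizing KL divergence — can be sketched as follows. This is a minimal NumPy illustration of the loss, not the paper's actual code; function names and tensor shapes are assumptions.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis (the senone classes).
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def soft_target_kl(student_logits, teacher_probs, eps=1e-12):
    """Frame-averaged KL(teacher || student): the distillation loss the
    small DNN minimizes against the RNN's soft alignments.

    student_logits : (frames, classes) raw outputs of the small DNN
    teacher_probs  : (frames, classes) soft alignments from the RNN
    """
    student_probs = softmax(student_logits)
    kl = np.sum(teacher_probs * (np.log(teacher_probs + eps)
                                 - np.log(student_probs + eps)), axis=-1)
    return float(kl.mean())
```

Since the teacher distribution is fixed during student training, minimizing this KL term is equivalent to minimizing the cross-entropy between the student's outputs and the teacher's soft targets.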
Similar Papers
RNN-LDA Clustering for Feature Based DNN Adaptation
Model-based deep neural network (DNN) adaptation approaches often require multi-pass decoding at test time. Input-feature-based DNN adaptation, for example based on latent Dirichlet allocation (LDA) clustering, provides a more efficient alternative. In conventional LDA clustering, the transition and correlation between neighboring clusters is ignored. In order to address this issue, a recurrent...
Semi-Supervised Training in Deep Learning Acoustic Model
We studied semi-supervised training in a fully connected deep neural network (DNN), unfolded recurrent neural network (RNN), and long short-term memory recurrent neural network (LSTM-RNN) with respect to the transcription quality, the importance of data sampling, and the training data amount. We found that DNN, unfolded RNN, and LSTM-RNN are increasingly more sensitive to labeling errors. For ...
Exploring robustness of DNN/RNN for extracting speaker Baum-Welch statistics in mismatched conditions
This work explores the use of DNN/RNN for extracting Baum-Welch sufficient statistics in place of the conventional GMM-UBM in speaker recognition. In this framework, the DNN/RNN is trained for automatic speech recognition (ASR), and each of the output units corresponds to a component of the GMM-UBM. Then the outputs of the network are combined with acoustic features to calculate sufficient statistics for...
Comparative Study of CNN and RNN for Natural Language Processing
Deep neural networks (DNNs) have revolutionized the field of natural language processing (NLP). Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN), the two main types of DNN architectures, are widely explored to handle various NLP tasks. CNN is supposed to be good at extracting position-invariant features and RNN at modeling units in sequence. The state-of-the-art on many NLP ...
TTS synthesis with bidirectional LSTM based recurrent neural networks
Feed-forward deep neural network (DNN)-based text-to-speech (TTS) systems have recently been shown to outperform decision-tree clustered context-dependent HMM TTS systems [1, 4]. However, the long time-span contextual effect in a speech utterance is still not easy to accommodate, due to the intrinsic feed-forward nature of DNN-based modeling. Also, to synthesize a smooth speech trajectory, th...